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Abstract 

A sampling method for spin systems is presented. The spin lattice is 
written as the union of a nested sequence of sublattices, all but the last 
with conditionally independent spins, which are sampled in succession 
using their marginals. The marginals are computed concurrently by a 
fast algorithm; errors in the evaluation of the marginals are offset by 
weights. There are no Markov chains and each sample is independent of 
the previous ones; the cost of a sample is proportional to the number of 
spins (but the number of samples needed for good statistics may grow 
with array size). The examples include the Edwards- Anderson spin glass 
in three dimensions. 
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1 Introduction. 

Monte Carlo sampling in physics is synonymous with Markov chain Monte Carlo 
(MCMC) for good reasons which are too well-known to need repeating (see e.g. 
I2],[IZ])- Yet there are problems where the free energy landscape exhibits multi- 
ple minima and the number of MCMC steps needed to produce an independent 
sample is huge; this happens for example in spin glass models (see e.g. [13], [IS])- 
It is therefore worthwhile to consider alternatives, and the purpose of the present 
paper is to propose one. 

An overview of the proposal is as follows: Consider a set of variables ( "spins" ) 
located at the nodes of a lattice L with a probability density P that one wishes 
to sample. Suppose one can construct a nested sequences of subsets Lq 3 

D • • • D Ln with the following properties: Lq = L] Ln contains few points; 
the marginal density of the variables in each Li is known, and given values of 
the spins in Li+i, the remaining variables in Li are independent. Then the 
following is an effective sampling strategy for the spins in L: First sample the 
spins in L„ so that each configuration is sampled with a frequency equal to its 
probability by first listing all the states of the spins in L„ and calculating their 
probabilities. Then sample the variables in each Li-i as i decreases from n — 1 
to zero using the independence of these variables, making sure that each state 
is visited with a frequency equal to its probability. Each state oi L = Lq is then 
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also sampled with a frequency equal to its probability, achieving importance 
sampling; the cost of each sample of L is proportional to the number of spins 
in L, and two successive samples are independent. This can be done exactly in 
a few uninteresting cases (for example in the one-dimensional Ising model, see 
e.g. [H]), but we will show that it can often be done approximately. The errors 
that come from the approximation can then be compensated for through the 
use of sampling weights. The examples shown below include spin glass models. 

The heart of the construction is the fast evaluation of marginals, of which 
a previous version was presented in [3; this is the subject of section 2. An 
example of the nested sets needed in the construction above is presented in sec- 
tion 3. The construction is in general approximate, and the resulting errors are 
to be compensated for by weights, whose calculation is explained in section 4; 
efficiency demands a balance between the accuracy of the marginalization and 
the variability of the weights, as explained in sections 4 and 6. The fast evalu- 
ation of marginals requires a sampling algorithm but the sampling algorithm is 
defined only once the marginalization is in place; this conundrum is resolved by 
an iteration which is presented in section 5; two iteration steps turn out to be 
sufficient. 

The algorithm is applied to the two-dimensional Ising model in section 6; this 
is a check of self-consistency. Aspects of the Edwards- Anderson (EA) spin glass 
model three dimensions are discussed in section 7. Extensions and conclusions 
are presented in a concluding section. 

As far as I know, there is no previous published work on Monte Carlo without 
chains for spin systems. The construction of marginals explained below and in 
the earlier paper [9] is a renormalization in the sense of Kadanoff (more precisely, 
a decimation) [15j ; renormalization has been used by many authors as a tool for 
speeding up MCMC, see e.g. [5],[3],[I1] . A construction conceptually related 
to the one here and also based on marginalization, but with a Markov chain, 
was presented in [24j . An alternative construction of marginals can be found in 
PU] , There is some kinship between the construction here and the decimation 
and message passing constructions in [7],[T2]. The specific connection between 
marginalization and conditional expectation used here originated in work on 
system reduction in the framework of optimal prediction, see [TU].[TT]. 

The results in this paper are preliminary in the sense that the marginalization 
is performed in the simplest way I could imagine; more sophisticated versions 
are suggested in the results and conclusion sections. Real-space renormalization 
or decimation and the evaluation of marginals are one and the same, and the 
present work could be written in either physics or probability language; it is the 
second option that has been adopted. 

All the examples below arc of spin systems with two- valued spins and near- 
neighbor interactions in either a square lattice in two dimensions or a cubic lat- 
tice in three dimensions. The Hamiltonians have the form H = ^i,j,k Ji,j,k,iSi' ,j\k' 
(with one subscript less in two dimensions), where the summation ^ is over 
near neighbors. In the Ising case the Jij^k,e are independent of the indices, in 
the spin glass case Jij,k,i are independent random variables; the index £ labels 
the direction of the interaction. 



2 



2 Fast evaluation of approximate marginals. 



In this section we present an algorithm for the evaluation of marginals, which 
is an updated version of the algorithm presented in [9]. For simplicity, we 
assume in this section, as in ^ , that the spins live on a two-dimensional lattice 
with periodic boundary conditions (the generalization to three dimensions is 
straightforward except for the geometrical issues discussed in section 3). We 
show how to go from Lq, the set of spins at the points («, j) of a regular lattice 
whose probability density is known, to Li, set of spins at the points such that 
{i + j) is even, and whose marginal is sought. The probability density function 
(pdf) of the variables in Li can be written in the form Pg = e^* /Z, where 
W(o) = -(3H, 13 is the inverse temperature, Z is the normalization constant, 
and the Hamiltonian H has the form specificed in the introduction. To simplify 
notations, write Ji^j^i — —PJijj and then drop the tildes, so that 

^(0) = ^ . {Jt,j,lSi+i^j + Jij\2Sij + l) . (1) 

The Si J take the values ±1. 

Let the marginal density of the spins in Li be Pi. One can always write 
Pi = ^ /Z, where Z is the same constant as in the pdf of Po- Call the set of 
spins in Li "S"' , and the set of spins in Lq but not in Li "S"' , so that S — S^ S 
is the set of spins in Lq. By definition of a marginal, 

(1) ^ Y^gM/C'CS) 



or 



W^W=log^e^'"(^), (2) 
s 

where the summation is over all the values of the spins in S. Extend the range of 
values of the spins in S (but not S) to the interval [0,1] as continuous variables 
but leave the expression for the Hamiltonian unchanged (this device is due to 
Okunev [?] and replaces the more awkward construction in [3]). Differentiate 
equation Q with respect to one of the a newly continuous variables s = Sij 
(we omit the indices to make the formulas easier to read): 



_ _ _ds 

ds " x;se'^""^^^ 

or 



ds 



ds 



(3) 



where E[- \ S] denotes a conditional expectation given S. A conditional expec- 
tation given S is an orthogonal projection onto the space of functions of S, and 
we approximate it by projecting onto the span of a finite basis of functions of 
S. 
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Before carrying out this projection, one should note the following property 
of = ^I^: take two groups of spins distant from each other in space, say 

5*1 and S2 ■ The variables in these groups should be approximately independent 
of each other, so that their joint pdf is approximately the product of their 
separate pdfs. The logarithm of their joint pdf is approximately the sum of 
the logarithms of their separate pdfs, and the derivative of that logarithm with 
respect to a variable in Si should not be a function of the variables in S2 ■ As 
a result, if one expands at s = s^j, one needs only to project on a set of 
functions of Sij and of a few neighbors of the point (i, j). It is this observation 
that makes the algorithm in the present section effective, and it is implied in 
the Kadanoff construction of a renormalized Hamiltonian, see e.g. [15]. 

As a basis on which to project, consider, following Kadanoff, the polynomials 
in S of the form: tjjp.q = X); j ^i,j^i+p,3+q ^^r various values of p, g, as well as 
polynomials of higher degree in the variables S. Define "fpp q = d''Pp,q/dsij; 
the functions ip'p ^ involve only near neighbors of Sij (for example, if ipi_i = 
J2e,k Si,kSi+i,k+i, then ip[ -^ = 2(si+i j+i + Si„i (no summation). Write the 
approximate conditional expectation of W^^^ as a sum: 

|^]=5]ap,,^;,^ + ..- . (4) 

Each function -ipp^q embodies an interaction, or linkage, between spins (p, q) 
apart, and this is an expansion in "successive linkages" . The functions W'^^^ , W^^^ , 
are invariant under the global symmetry s — > — s, and only polynomials hav- 
ing this symmetry need to be considered (but see the discussion of symmetry 
breaking in section 6). For the reasons stated above, this series should converge 
rapidly as p, q increase. Evaluate the coefficients in by orthogonal projection 
onto the span of the ip'p,q- This produces one equation per point (i,j) (unless 
the system is translation invariant, like the Ising model, in which case all these 
equations are translates of each other). Assume furthermore that one has an 
algorithm for sampling the pdf Pq = e^^°^ jZ (this is not trivial as the goal 
of the whole exercise is to find good ways to sample Pq; see section 5). The 
projection can then be carried out by the usual method: reindex the basis func- 
tions with a single integer, so that they become V'ii'02, • ■ say = Vi,! etc.; 
at each point (i,j) estimate by Monte Carlo the entries ap q = E[^pij)q\ of a 
matrix A, and the entries bp ~ E\W'''^'^ Vp] of a vector 6, where E[-] denotes an 
expectation. The projection we want is X^^pV'pj where the coefficients Op are 
the entries of the vector A~^h (see e.g. [TT]). In the current paper we use only 
the simplest basis with tAi^i, i/ii^-i, i/j-i^i, i/j-i^-i (functions such as ?Ao,i or ipi^o 
do not appear because they involve spins not in 5*) . In three dimensions also we 
use basis functions of the form ^ SijkSej'k' where {i' k') is a near neighbor 
of k) on a reduced lattice. The locality of the functions ijj' make the algo- 
rithm efficient; more elaborate bases for the Ising case can be found in 9 . We 
have not invoked here any translation invariance, in view of later applications to 
spin glasses. For the Hamiltonian ([1]), the quantity W^^") — for s — sij 
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is 

— — + "^i-i-JA^i-iJ + j',2Sij + l + JiJ-1.2SiJ-l. (5) 

The marginal density of the variables in L2 (the set of points (i, j) such that both 
i and j are odd), is obtained by projecting on the basis functions ^^0,2, "00,-2, 
•02,0, "0-2,0, etc. A single sample of all the spins in Lq can be used to generate 
samples of the inner products needed to find the coefficients in the projections on 
all the sublattices; it is not a good idea to use the marginal of Li to evaluate the 
marginal of L2, etc., because this may lead to a catastrophic error accumulation 
[9], [21]; it is the original W'^ = ^^^^ projected in the expansions at the 

different levels. 

The last step is to reconstruct M^^^^ from its derivatives W'^^^ . In the Ising 
(translation invariant) case, this is trivial: W^^^ = ^ cip^p implies W'^^^ = 

Gpipp. In the spin glass case, there is a minor conceptual (thought not prac- 
tical) difficulty. For the various computed functions W^^^ to be derivatives of 
some VF^^^ their cross derivatives must be equal. However, this equality requires 
an equality between coefficients evaluated at diS'erent points (z, j) of the lattice. 
For example, if "0i after renumbering is "0i,i before the renumbering in the no- 
tations above, and ips is ip-i^-i before the renumbering, then the coefficient 
as at the point should equal the coefficient oi at the point (z — l,j — 1), 
and indeed these coefficients describe the same interaction between the spins 
at (i, j) and (z — l,i — 1). However, these coefficients are computed separately 
by approximate computations, and therefore, though closely correlated (with a 
correlation coefficient typically above .95), they are not identical. This is not a 
practical difficulty, because the replacement of one of these coefficients by the 
other, or of both by some convex linear combination of the two, does not mea- 
surably affect the outcome of the calculation. The conceptual issue is resolved 
if one notices that (i) for every lattice Lj, except the smallest one L„, one needs 
the coefficients Op at only half the points, and using the coefficients calculated 
at these points is unambiguous, and (ii) allowing these coefficients to differ is 
the same as writing the Hamiltonian W as (s, Ms) where s is the vector of spins 
and M is an asymmetric matrix. However, the values of W are the same if M 
is replaced by its symmetric part, which means replacing each of the two values 
of a coefficient by the mean of the two values. If this is done everywhere the 
cross derivatives become equal. 

Finally, a practical comment. One may worry about a possible loss of accu- 
racy due to the ill-conditioning of a projection on a non-orthogonal polynomial 
basis. I did develop an approximate Monte Carlo Gram-Schmidt orthogonaliza- 
tion algorithm and compared the resulting projection with what has just been 
described; I could see no difference. 
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3 The nested sublattices. 



In this section we construct nested sequences of sublattices L = Lq,Li,... 
such that the spins in Lj are independent once those in ij+i are determined. 
There is nothing unique about this construction; the nested sequences should 
have appropriate independence and approximation properties while leading to 
efficient programs. 

The two-dimensional case was already discussed in the previous section: 
assume L = is the set of nodes integers. We can choose as the next 

smaller array of spins Li the set of spins at the location with i + j even; 
then the spins in Lq are independent once in Li are known. The next lattice L2 
is the one where i,j are both odd; if the marginal on Li is approximated with 
the four basis functions described in the previous section, then the spins in Li 
are independent one those in L2 are given. From then on the subsets Li can be 
constructed by similarity. If periodic boundary conditions are imposed on Lq, 
they inherited by every successive sublattice. 

If one wants to carry out an expansion in a basis with more polynomials, the 
requirement that the spins in Li be independent once those in Xi+i are known 
places restrictions on the polynomials one can use; for example, the polynomial 
J2 Si,jSi+2,j for i,j such that i+ j is even is a function of only the spins in Li 
but it cannot be used, because the spins at points where i + j is even while each 
of i,j is odd would not be independent once the spins in L2 are known, given 
the linkage created by this added polynomial. This places creates a restriction 
on the accuracy of the various marginals; see also the discussion in section 6. 
However, the point made in the present paper is that significant inaccuracy in 
the marginals can be tolerated. 

The analogous construction in three dimensions is neither so simple nor 
unique. Here is what is done in the present paper: Lq = L consists of all the 
nodes on a regular cubic lattice, i.e., the set of points {i,j, k) where k are 
integers between 1 and N and is a power of 2. 

Li consists of the points(z, j, k) where i+j is even for k odd and odd when k is 
even; the neighbors of (z, j, k) in Li are the 8 points (i, j±l, fc±l), (i±l, j±l, k). 

L2 consists of the points {i.j. k) where i is even, j is odd and k is even. The 
neighbors of a point k) in L2 are the 12 points {i ± 1, j, k ± 1), [i ± 1, ji' ± 
2,/c± 1). 

Lj, consists of the points {i, j,k) where i,j,k are all odd; the neighbors of 
k) are {i ± 2,j, k), ± 2, k), k ± 2). This sublattice is similar to the 
original lattice with the distance between sites increased to 2; the next lattices 
up can then be obtained by similarity. 

This process of constructing sublattices with ever smaller numbers of spins 
stops when one reaches a number of spins small enough to be sampled directly, 
i.e., by listing all the states, evaluating their probabilities, and picking a state 
with a frequency equal to its probability. One has to decide what the smallest 
lattice is; the best one could do in this sequence is a lattice similar to L2 with 
the points {i,j,k) where i — 1 = 2£, j — 1 = 2^, and k = 2(' for an integer £ 
chosen so that there are 16 points in this smallest lattice. Here too each lattice 
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inherits periodic boundary conditions from the original lattice; on the smallest 
lattice one notes that due to periodicity i + 2£ = {i — 2£)mod{N) so that some of 
the neighbors of a points {i,j,k) are not distinct and this must be reflected in 
the evaluation of the last Hamiltonian, or else all the linear systems one solves 
in the projection step are singular. 

The polynomial basis used in this paper, except in the diagnostics sections, 
consists at every level of polynomials of the form = ^i,3,kSi^,j^,km j where 
(i, J, k) is a point in the sublattice L„i and {im,jm, km) is one of its near neigh- 
bors on that sublattice. 



4 Sampling strategy and weights. 

If the marginals whose computation has just been described were exact, the algo- 
rithm outlined in the introduction would be exact. However, the marginals are 
usually only approximate, and one has to take into account the errors in them, 
due both to the use of too few basis functions and to the errors in the numerical 
determination of the projection coefficients. The idea here is to compensate for 
these errors through appropriate weights. 

Suppose one wants to compute the average of a function h{S) of a random 
variable 5, whose pdf is P{x), i.e., compute E[f{S)] = J h{x)P{x)dx. Suppose 
one has no way to sample S but one can sample a nearby variable S'o whose pdf 
is Po{x). One then writes 



/h{x)P(x)dx = [ h(x) — ^^—l-PQ(x)dx 
J Po(x) 



where the Soi are successive samples of S'o , i = 1 , . . . , iVg , and Wi ~ P{Soi) /Po {Soi 
are sampling weights (see e.g. [H]). In our case, P is the true probability den- 
sity e^^°^/Z and Pq is the probability density of the sample 5*0^ produced by 
the algorithm we describe, whose pdf Pq differs from P because the marginals 
used are only approximate. 

The probability Pq of a sample S'o = [sij, ■ ■ ■ ,SAr,Ar) has to be computed 
as the sample is produced: the probability of each state of the spins in L„ is 
known and therefore the probability of the starting sample of L„ is known; each 
time one samples a spin in Lj , j < n, one has choices whose probabilities can be 
computed. As a practical matter, one must keep track not of the probabilities 
themselves but rather of their logs, or else one is undone by numerical underflow. 
Note that in the evaluation of P/Pq the factor remains unknown, but as 
Z is common to all the samples, this does not matter, (and this remark can be 
made into an effective algorithm for evaluating Z and hence the entropy). In 
practice I found it convenient to pick a value for Z so that E[log{P/Po)] = 0. 

In practice, for lattices that are not very small, there is a significant range 
of weights, and there is a danger that the averaging will be dominated by a few 
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large weights, which increase the statistical error. This issue has been discussed 
before (see e.g.[T3) where it is suggested that one resort to "layering"; this is 
indeed what we shall do, but more cautiously than suggested in previous work. 
Suppose one caps all weights at some value W, i.e., replace the weights Wi by 
w'^ = min{wi,W). The effective number of samples in the evaluation of the 
variance is Nw + Y^{wi/w), where Nw is the number of samples with Wi>W, 
and the summation is over the samples with Wi < W. Define the fraction / as 
the fraction of the samples such that Wi > W (so that w^ — W); as W increases 
the fraction / tends to zero. The averages computed by the algorithm here 
depend on / (or W) and so does the statistical error; one has to ascertain that 
any result one claims is independent of /. Typically, as the size of the lattice 
increases, the range of weights increases, and therefore the effective number of 
samples decreases for a given number of samples Ng. What one has to do is 
check that the results converge to a limit as / decreases while the number of 
samples is still large enough for the results to be statistically significant. This 
may require an increase in as N increases. 

5 Bootstrapping the sampling. 

So far it has been assumed that one can sample the density ' /Z well enough 
to compute the coefficient in the Kadanoff expansion ^ of the marginals. How- 
ever, these coefficients are needed to make the sampling efficient when it would 
not otherwise be so, and the sampling has to be "bootstrapped" by iteration so 
it can be used to determine its own coefficients. 

First, make a guess about the coefficients in (|4]), say, set afj^ = af'™'" for 
the ^— the coefficient at the point i,j in the sublattice Lm, where the numbers 
^£,m,o g^^g some plausible guesses. My experience is that it does not much matter 
what these guesses are; I typically picked them to be some moderate constant 
independent of i,j,m,£. Use these coefficients in a sampling procedure to find 
new coefficients af'™'^, and repeat as necessary. An iteration of this kind was 
discussed in [5] , where it was shown that as the number of samples increases the 
error in the evaluation can be surprisingly small and that the optimal number 
of polynomials to use for a given overall accuracy depends on the number of 
samples. In the present work I found by numerical experiment that convergence 
is faster if, after one evaluates a new coefficient afj^'^'^^, one sets in the next 

round af"^'^^^ — {afj^'^ + a^'"'''^^^)/2. I found experimentally that there is no 
advantage in computing these coefficients very accurately, indeed a relatively 
small number of samples is sufficient for each iteration, and two iterations have 
been sufficient for all the runs below. 

6 Example 1: The two-dimensional Ising model. 

To check the algorithm and gauge its performance, we begin by applying it to 
the two-dimensional Ising model. I did not write a special program for this case 
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and did not take advantage of the simplifications which arise when the couphng 
constants in the Hamiltonians are independent of location and one could replace 
four basis functions by the single function consisting of their sum, and the single 
expansion coefficient is the same at all points so that the projection can be 
averaged in space as well as over samples. Once expansion coefficients have 
been determined, they can be used to generate as many samples as one wants. 

First, I computed the mean magnetization E[ij], where /x = -^J^^i-j^ 1 — 
hj < N, as a, function of the temperature T. To calculate such means near T = 
Tc one needs a way to break the symmetry of the problem; this is usually done 
by adding a small asymmetric term e ^ sij to the Hamiltonian, for example 
with e = eo/N, eq ~ 0.2. Adding such a field here works for small A^, but as 
N increases, a value of eq small enough not to disturb the final result may not 
suffice to bias the smallest lattice L„ in one direction. The remedy is to assign 
positive weights to only to those spin configurations in L„ (where weights are 
explicitly known) such that — ^■ 

It may be tempting to introduce a symmetry breaker into the initial Hamil- 
tonian W^°^ , add terms odd in the Sij to the series of linkages, and attempt 
to compute appropriate symmetry-breaking terms for the smaller lattices by a 
construction like the one above. This is not a good idea. The longer series 
is expensive to use and the computation of higher Hamiltonians is unstable to 
asymmetric perturbations, generating unnecessary errors. 

In Table 1 we present numerical results for T = 2.2 and different values of N 
compared with values obtained by Metropolis sampling with many sweeps of the 
lattice. Ns = 1000 samples were used in each of two iterations to calculate the 
approximate marginals; once these are found one can inexpensively generate as 
many samples of of the spins as wanted; here 1000 were used. The Table exhibits 
the dependence of the computed E[iJ,] on the fraction / of weights which have 
been capped; the statistical error in the estimates of E[^] grows as / decreases, 
but more slowly than one would expect. For TV = 16, 32, 64 the results converge 
as / ^ before the statistical error becomes large, but not when N = 128, and 
one should conclude that this value of N is too large for the present algorithm 
with so small a basis in the computation of marginals. Even with N = 128 
one obtains a reasonable average {E[fi] — .80) if one is willing to use enough 
samples. Note also that the weights become large, and the calculations must be 
performed in double precision. 

We now turn to the determination of the critical temperature Tc- This can 
be obtained from the intersection of the graphs of E[fj] vs. T for various values 
of N (see e.g. [II],[IS]); it is more instructive here to apply the construction in 
[9] , based on the fact that if one expands the "renormalized" Hamiltonians in 
successive linkages, ie., if one finds the functions W^''^ such that the marginals 
on L, are exp(W^('))/Z, using the series (jj]), then the coefficients in the series 
increase when T < Tc and decrease when T > T^- For this construction to 
work, one needs enough polynomials for convergence, i.e., so that the addition 
of more polynomials leaves the calculation unchanged. In the present case, this 
is achieved with the following 7 polynomials: 'ipi,'ip2,''p3, '04 = Si.jSi±ij±i (as 

above), -tps = I]Sjj(si+2j + Sij+2 + Si-2,j + Sij-2),1p6 = = 
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Table 1 



Ising magnetization at T = 2.2 



size of 


no. of samples 




f 

J 


E\u\ 


metropolis 


array 


Ns 










16 X 16 


1000 


2 


.33 


.74±0.01 


.01 ±.01 






4 


.08 


.80±0.01 








6 


.00 


.80±0.01 




32 X 32 


1000 


5 


.17 


.76±0.01 


.81 ±.01 






7 


.10 


.80±0.01 








9 


.04 


.81±0.015 




64 X 64 


1000 


15 


.134 


.74±.01 


.801 ±.001 






20 


.042 


.77±.01 








25 


.009 


.80±.01 








30 


.001 


.80±.015 




128 X 128 


1000 


25 


.074 


.67±.01 


.798 ± .001 






35 


.023 


.70±.01 








45 


.002 


.74±.02 








50 


.001 


.75±.05 






no 


converj 


;ence 







J2 Sijcf j/lOO, where atj = Sj+ij + Sij+i + Sij+i + Sij-i. The use of polyno- 
mials with higher powers of the Sij is essential (see e.g. [4]), and it is the more 
surprising that the approximate marginals calculated without them are already 
able to produce usable samples. The constant divisors in 'ipe,ip7 are there to 
keep all the coefficients within the same order of magnitude. No advantage is 
taken here of the symmetries of the Ising model. In Table 2 I present the sums 
of the coefficients in the expansion as a function of the temperature T for the 
levels i = 2,4,6 (where the lattices are mutually similar) in a N"^ lattice with 

= 16 {N is chosen small for reference in the next section). From Table 2 one 
can readily deduce that 2.25 < Tc < 2.33; for the value of T between these two 
bounds the sum of the coefhcients oscillates as i increases. Taking the average 
of the two bounds (which are not optimal) yields Tc = 2.29 (the exact value is 
Tc = 2.269...). A more careful analysis improves the result and so does a larger 
value of N. All the coefficients have the same sign, except occasionally when a 
coefficient has a very small absolute value. 

For the sake of completeness I plotted in Figure 1 a histogram of the log- 
arithms of the weights Wi for the Ising model with iV = 32 and 10* samples; 
the zero of logw is chosen as described above. The cost per sample of an op- 
timized version of this program for N = 32,64 is competitive with the cost 
of a cluster algorithm P21 and is significantly lower than that of a standard 
Metropolis sampler. It is not claimed that for the Ising model the chainless 
sampler is competitive with a cluster algorithm: as N increases the complexity 
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Table 2 



Sums of coefficients of Kadanoff expansion 
as a function of T for Ising model 



T 


i = 2 


i = 4 


i = 6 


2.20 


1.63 


2.16 


2.71 


2.25 


1.50 


1.77 


1.88 


2.26 


1.49 


1.73 


1.50 


2.28 


1.45 


1.65 


1.40 


2.30 


1.41 


1.50 


1.30 


2.32 


1.36 


1.40 


1.13 


2.33 


1.35 


1.34 


1.15 


2.35 


1.30 


1.27 


0.95 



of the present algorithm grows because one has to add polynomials and/or put 
up with a decrease in the number of effective samples, and one also has to do 
work to examine the convergence as / — > 0. The present sampler is meant to 
be useful when MCMC is slow, as in the next section. 

7 Example 2: The Edwards- Anderson spin glass 
in three dimensions 

We now use the chainless construction to calculate some properties of the EA 
spin glass [T31II11IIS1 and in particular, estimate the critical temperature 

Tc. The three dimensional Edwards- Anderson spin glass model is defined by 
equation ([1]), where the Ji,j,k,i = P^i,j,k,i, the £.i,j,k,i are independent Gaussian 
random variables with mean zero and variance one, and /3 = 1/T is the inverse 
temperature. Periodic boundary conditions are imposed on the edge of the A'^^ 
lattice. 

Let the symbol < ■ >t denotes a thermal average for a given sample of 
the Js, and [-Ja^ denote an averages over the realizations of the Js. Given two 
independent thermal samples of the spins S'1.2 = {s^'j we define their overlap 
to be 9 = A^~'^ J2 slj k^ljk' where the summation is over all sites in the lattice. 
The Binder ratio [3]',[IS]'is g = 0.5(3- [< >t]av/[< >t]L)- The function 
g = g{T) is universal, and the graphs of g as a function of T for various values 
of the lattice size N should intersect at T = Tc- 

The method presented in the present paper is applied to this problem. The 
only additional comment needed is that at every point of the lattice one has 
to invert a matrix generated by a random process involving integers, and oc- 
casionally one of these matrices will be singular or nearly so, and will produce 
unreliable coefficients, particularly for small samples sizes and low temperatures. 
As long as there are few such cases, there is no harm in jettisoning the resulting 
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Figure 1: Histogram of weights for the Ising model, N = 32, 10^ samples. 

coefficients and replacing them by zeroes. 

In Figure 2 I display the results obtained for this problem. The statistical 
error is hard to The numerical parameters are: 2000 realizations of the Js, for 
each one of them 1000 samples for estimating the expansion coefficients and then 
5000 samples for evaluating q and its moments. I used the bound logW — 30, 
which produces a modified fractions / = for = 4, / = 0.015 for N = 8 and 
/ = .05 for N = 16. The statistical error was hard to gauge; one can readily 
estimate the standard deviations of the numerical estimates of [< q'^ >t]av and 
[< q^ >t]av but these estimates are correlated and one therefore cannot use 
their standard deviations to estimate that oi g. I simply made several runs for 
some of these computations and used the scatter of the results to estimate the 
statistical error. I concluded that the statistical error is around 1% for = 4 
and 2 - 3% for = 8, 16. 

These three graphs taken separately approximate the ones in the detailed 
computations of [l6]. The graphs for = 4 and A^ = 16 intersect at T = .93, 
which is the value of Tc deduced from the Binder cumulant computation in 
[TB] (and which differs from the value Tc = .95 deduced in the same paper 
from other considerations and which is likely to be right from the accumulated 
previous wisdom, as reported in that paper). The graph for A^ = 8 is a little off 
from what one may expect, but again previous calculations, for example Figure 
7 in |16j . also display symptoms of unexpected waywardness. If one compares 
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Figure 2: The Binder cumulant g as a function of the temperature T in the 
three-dimensional AE model. 



Figure 2 with Figure 4 of [16,, one sees other small discrepancies; for example, 
the values of 17 I obtained, in particular for iV = 4, are smaller than those in 
[IB] by a small but statistically significant amount; this cannot be the effect of 
"layering" (i.e., the use of a bound W) because for = 4 layering plays no 
role; it is hard to see how it can be produced the statistical error in either paper 
because the sample sizes are certainly large enough. I have no explanation, 
except the general orneriness of the EA spin glass, as illustrated by the widely 
varying results for various exponents and thresholds summarized in [16j . These 
vagaries do not alter the fact that the algorithm of the present paper produces 
worthy results in a very difficult problem. A more detailed exploration of spin 
glasses will be published separately. 

It is of interest in the present context to see how the coefhcients in the 
Kadanoff expansion, used in the previous section to estimate for the Ising 
model, behave in the three-dimensional spin glass. Now one needs more poly- 
nomials (20; the three-dimensional analogs of the ones in the preceding section 
plus others following the same pattern) . The sums of all the coefficients add up 
to small numbers statistically indistinguishable from zero, as one may expect, 
so in Table 3 I display the sums of the absolute values of these coefficients for 
A^ = 16 on self-similar lattices, which in the Ising case are equally able to exhibit 
Tc for this value of N. No bifurcation between growing and decreasing sums 
can be detected near illustrating differences between the phase transitions 
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Table 3 



Sums of absolute values of Kadanoff 
coefficients for EA spin glass model 



T 



= 2 



i = 5 



I 



= 8 



0.6 
0.9 
1.0 
2.00 



3.07 
2.70 
2.43 
6.20 



3.95 
3.90 
3.34 
8.51 



3.73 
3.42 
2.76 
8.87 



in the Ising and spin glass cases. Note the results for T ~ 0.6, a temperature 
hard to reach by a MCMC process. 

The real question here is the efficiency and speed of the algorithm. What are 
needed are timing comparisons between an optimized version of it and optimized 
version of alternatives, such as the parallel tempering construction of [16]. This 
is not available. The least one can say is that the chainless sampler is highly 
competitive with others. Most of the computations in this paper (all but the 
ones for g at A'^ = 16) were ffi'st run on a single serial desktop machine. 

8 Conclusions 

A Monte Carlo sampling technique that relies on a fast marginalization rather 
than a Markov chain has been introduced, tested, and applied to a challenging 
test problem. The results demonstrate that it is a good alternative, especially 
for problems where the free energy has many minima and MCMC algorithms 
may be slow. Various improvements to this constructions readily suggest them- 
selves, based on more polynomials, better polynomials, and renormalization 
schemes other than decimation. Related ideas, such as the parallel marginaliza- 
tion scheme proposed in |24| , are also worth further investigation in the context 
of spin problems. 

The construction of the sequence of lattices above assumed that the origi- 
nal Hamiltonian involves only near-neighbor interactions; the lifting of this re- 
striction requires a more elaborate renormalization process and will be pursued 
elsewhere. 
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